 non-local module


Compact Generalized Non-local Network

Neural Information Processing Systems

The non-local module is designed for capturing long-range spatio-temporal dependencies in images and videos. Although it has shown excellent performance, it lacks a mechanism to model the interactions between positions across channels, which are of vital importance in recognizing fine-grained objects and actions. To address this limitation, we generalize the non-local module and take the correlations between the positions of any two channels into account. This extension uses a compact representation of multiple kernel functions, obtained with Taylor expansion, which gives the generalized non-local module a fast and low-complexity computation flow. Moreover, we implement our generalized non-local method within channel groups to ease the optimization. Experimental results illustrate the clear-cut improvements and practical applicability of the generalized non-local module on both fine-grained object recognition and video classification.
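To make the difference between position-only and position-and-channel correlations concrete, here is a minimal NumPy sketch. It is not the authors' implementation: it assumes identity embeddings (no learned projections), a plain dot-product kernel instead of the Taylor-expanded kernels the abstract mentions, and no channel grouping. The function names are hypothetical. It only illustrates that, for a linear kernel, the large channel-aware affinity matrix never has to be formed, because the product can be reordered.

```python
import numpy as np

def nonlocal_dot(x):
    """Position-only non-local op with a dot-product kernel.

    x: (N, C) array, N positions with C channels each.
    Embeddings are the identity here (an assumption for illustration).
    """
    n = x.shape[0]
    affinity = x @ x.T           # (N, N): one weight per pair of positions
    return affinity @ x / n      # (N, C)

def generalized_nonlocal_naive(x):
    """Channel-aware variant: every (position, channel) entry interacts
    with every other entry, so the affinity is (N*C, N*C)."""
    v = x.reshape(-1)            # flatten positions and channels together
    affinity = np.outer(v, v)    # (N*C, N*C), quadratic in N*C
    return (affinity @ v).reshape(x.shape) / v.size

def generalized_nonlocal_compact(x):
    """Same result as the naive version for a linear kernel, computed by
    reordering the product: v (v^T v) instead of (v v^T) v."""
    v = x.reshape(-1)
    z = v @ v                    # scalar, linear in N*C
    return (v * z).reshape(x.shape) / v.size

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 4))  # hypothetical toy input: 6 positions, 4 channels
y_position_only = nonlocal_dot(x)
y_naive = generalized_nonlocal_naive(x)
y_compact = generalized_nonlocal_compact(x)
```

With a linear kernel the two channel-aware versions agree up to floating-point error, while the compact one avoids the quadratic-size matrix. Richer kernels need an approximation step before the same reordering applies, which is the role the abstract assigns to the Taylor expansion.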


Non-Local Recurrent Network for Image Restoration

Ding Liu, Bihan Wen, Yuchen Fan, Chen Change Loy, Thomas S. Huang

Neural Information Processing Systems

Many classic methods have shown non-local self-similarity in natural images to be an effective prior for image restoration. However, it remains unclear and challenging to make use of this intrinsic property via deep networks.




Reviews: Compact Generalized Non-local Network

Neural Information Processing Systems

This paper proposes a novel network module that exploits global (non-local) correlations in the feature map to improve ConvNets. The authors focus on a weakness of the non-local (NL) module [31], namely that correlations across channels are not sufficiently taken into account, and formulate the compact generalized non-local (CGNL) module to remedy the issue by summarizing the previous methods of NL and bilinear pooling [14] in a unified manner. The CGNL module is evaluated in thorough experiments on action and fine-grained classification tasks, exhibiting promising performance that is competitive with the state of the art. Positives: The paper is well organized and easy to follow. The generalized formulation (Eqs. 8 and 9) that unifies bilinear pooling and the non-local module is theoretically sound.


Efficient Spatialtemporal Context Modeling for Action Recognition

Cao, Congqi, Lu, Yue, Zhang, Yifan, Jiang, Dongmei, Zhang, Yanning

arXiv.org Artificial Intelligence

Contextual information plays an important role in action recognition. Local operations have difficulty modeling the relation between two elements separated by a long distance. However, directly modeling the contextual information between any two points incurs a huge cost in computation and memory, especially for action recognition, where there is an additional temporal dimension. Inspired by the 2D criss-cross attention used in segmentation, we propose a recurrent 3D criss-cross attention (RCCA-3D) module to model the dense long-range spatiotemporal contextual information in video for action recognition. The global context is factorized into sparse relation maps. At each step, we model the relationship between points on the same line along the horizontal, vertical, and depth directions, which forms a 3D criss-cross structure, and we repeat the same operation with a recurrent mechanism to propagate the relation between points from a line to a plane and finally to the whole spatiotemporal space. Compared with the non-local method, the proposed RCCA-3D module reduces the number of parameters and FLOPs by 25% and 11%, respectively, for video context modeling. We evaluate the performance of RCCA-3D with two recent action recognition networks on three datasets and make a thorough analysis of the architecture, obtaining the best way to factorize and fuse the relation maps. Comparisons with other state-of-the-art methods demonstrate the effectiveness and efficiency of our model.
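The line-to-plane-to-volume propagation described above can be checked with a small counting sketch. This is not the RCCA-3D module itself (there is no attention, no learned weights); it only tracks which positions can influence a given position after each criss-cross pass. The grid size and function names are hypothetical, and the pair counts below are not the parameter or FLOP figures quoted in the abstract.

```python
from math import prod

def criss_cross_neighbors(p, shape):
    """Positions that share a line with p along any one axis (p included)."""
    out = {p}
    for axis, size in enumerate(shape):
        for v in range(size):
            q = list(p)
            q[axis] = v
            out.add(tuple(q))
    return out

def reachable_after(p, shape, steps):
    """Positions whose information can reach p after `steps` criss-cross passes."""
    frontier = {p}
    for _ in range(steps):
        frontier = set().union(
            *(criss_cross_neighbors(q, shape) for q in frontier)
        )
    return frontier

shape = (4, 5, 6)                 # hypothetical (T, H, W) feature volume
n = prod(shape)                   # 120 positions in total
start = (0, 0, 0)

# Coverage after 1, 2, and 3 passes: a line set, then planes, then everything.
coverage = [len(reachable_after(start, shape, s)) for s in (1, 2, 3)]

# Pairwise relations per pass: dense non-local vs. one criss-cross pass.
dense_pairs = n * n
sparse_pairs = n * (sum(shape) - len(shape) + 1)
```

For this 4 x 5 x 6 grid, `coverage` is `[13, 60, 120]`: one pass reaches the three lines through the point, two passes reach the three planes, and three passes reach every position, while each pass scores 13 relations per position instead of 120.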


Compact Generalized Non-local Network

Yue, Kaiyu, Sun, Ming, Yuan, Yuchen, Zhou, Feng, Ding, Errui, Xu, Fuxin

Neural Information Processing Systems

The non-local module is designed for capturing long-range spatio-temporal dependencies in images and videos. Although it has shown excellent performance, it lacks a mechanism to model the interactions between positions across channels, which are of vital importance in recognizing fine-grained objects and actions. To address this limitation, we generalize the non-local module and take the correlations between the positions of any two channels into account. This extension uses a compact representation of multiple kernel functions, obtained with Taylor expansion, which gives the generalized non-local module a fast and low-complexity computation flow. Moreover, we implement our generalized non-local method within channel groups to ease the optimization.


Non-Local Recurrent Network for Image Restoration

Liu, Ding, Wen, Bihan, Fan, Yuchen, Loy, Chen Change, Huang, Thomas S.

Neural Information Processing Systems

Many classic methods have shown non-local self-similarity in natural images to be an effective prior for image restoration. However, it remains unclear and challenging to make use of this intrinsic property via deep networks. In this paper, we propose a non-local recurrent network (NLRN) as the first attempt to incorporate non-local operations into a recurrent neural network (RNN) for image restoration. The main contributions of this work are: (1) Unlike existing methods that measure self-similarity in an isolated manner, the proposed non-local module can be flexibly integrated into existing deep networks for end-to-end training to capture deep feature correlation between each location and its neighborhood. (2) We fully employ the RNN structure for its parameter efficiency and allow deep feature correlation to be propagated along adjacent recurrent states. This new design boosts robustness against inaccurate correlation estimation due to severely degraded images. (3) We show that it is essential to maintain a confined neighborhood for computing deep feature correlation given degraded images. This is in contrast to existing practice that deploys the whole image. Extensive experiments on both image denoising and super-resolution tasks are conducted. Thanks to the recurrent non-local operations and correlation propagation, the proposed NLRN achieves superior results to state-of-the-art methods with many fewer parameters.
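Contribution (3), computing feature correlation only within a confined neighborhood rather than over the whole image, can be sketched as follows. This is an illustration under stated assumptions, not the NLRN implementation: embeddings are the identity, the weights are a softmax over dot products, there is no recurrence or correlation propagation, and the function name and `radius` parameter are hypothetical.

```python
import numpy as np

def confined_nonlocal(x, radius):
    """Non-local aggregation restricted to a square window around each position.

    x: (H, W, C) float feature map.
    radius: half-width of the window; the window is clipped at the borders.
    """
    h, w, c = x.shape
    out = np.empty_like(x)
    for i in range(h):
        for j in range(w):
            i0, i1 = max(0, i - radius), min(h, i + radius + 1)
            j0, j1 = max(0, j - radius), min(w, j + radius + 1)
            patch = x[i0:i1, j0:j1].reshape(-1, c)   # neighbors, (K, C)
            scores = patch @ x[i, j]                 # similarity to the center
            weights = np.exp(scores - scores.max())  # numerically stable softmax
            weights /= weights.sum()
            out[i, j] = weights @ patch              # weighted sum of neighbors
    return out

rng = np.random.default_rng(0)
features = rng.standard_normal((8, 8, 3))            # hypothetical toy feature map
local_out = confined_nonlocal(features, radius=2)
```

Setting `radius` large enough to cover the whole map recovers the whole-image variant that the abstract contrasts against; a radius of 0 leaves the features unchanged.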